    Visual perception of liquids: Insights from deep neural networks

    Visually inferring material properties is crucial for many tasks, yet poses significant computational challenges for biological vision. Liquids and gels are particularly challenging due to their extreme variability and complex behaviour. We reasoned that measuring and modelling viscosity perception is a useful case study for identifying general principles of complex visual inferences. In recent years, artificial Deep Neural Networks (DNNs) have yielded breakthroughs in challenging real-world vision tasks. However, to model human vision, the emphasis lies not on the best possible performance, but on mimicking the specific pattern of successes and errors humans make. We trained a DNN to estimate the viscosity of liquids using 100,000 simulations depicting liquids with sixteen different viscosities interacting in ten different scenes (stirring, pouring, splashing, etc.). We find that a shallow feedforward network trained for only 30 epochs predicts mean observer performance better than most individual observers do, making it the first successful image-computable model of human viscosity perception. Further training improved accuracy but predicted human perception less well. We analysed the network’s features using representational similarity analysis (RSA) and a range of image descriptors (e.g. optic flow, colour saturation, GIST). This revealed clusters of units sensitive to specific classes of feature. We also find a distinct population of units that are poorly explained by hand-engineered features, but which are particularly important both for physical viscosity estimation and for the specific pattern of human responses. The final layers represent many distinct stimulus characteristics, not just the viscosity on which the network was trained. Retraining the fully connected layer with a reduced number of units achieves practically identical performance, but results in representations focused on viscosity, suggesting that network capacity is a crucial parameter determining whether artificial or biological neural networks use distributed or localized representations.
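    As a purely illustrative aside, the core of an RSA comparison like the one described above can be sketched in a few lines of Python: build a representational dissimilarity matrix (RDM) over stimuli for a network layer and for a hand-engineered descriptor such as GIST, then rank-correlate the two. This is a minimal sketch, not the study’s code; the array shapes, variable names, and random placeholder data are all assumptions.

        # Minimal RSA sketch: compare a DNN layer's representational geometry
        # with a hand-engineered image descriptor (e.g. GIST).
        # All data below are random placeholders, not the study's data.
        import numpy as np
        from scipy.spatial.distance import pdist
        from scipy.stats import spearmanr

        def rdm(features):
            # Representational dissimilarity matrix over stimuli
            # (condensed upper triangle, correlation distance).
            return pdist(features, metric="correlation")

        layer_acts = np.random.rand(160, 512)  # (n_stimuli, n_units), hypothetical
        gist_desc = np.random.rand(160, 64)    # (n_stimuli, n_dims), hypothetical

        # RSA score: rank correlation between the two dissimilarity structures
        rho, p = spearmanr(rdm(layer_acts), rdm(gist_desc))
        print(f"RSA (Spearman rho) = {rho:.3f}, p = {p:.3g}")

    Repeating this comparison per descriptor and per group of units is one straightforward way to obtain the kind of feature-sensitivity clustering the abstract describes.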

    Perceived object motion variance across optical contexts

    Visual motion computation is challenging under real-world conditions due to continuous contextual changes, such as varying lighting conditions and a large range of optical material properties. Because of these changes, the retinal optical flow can vary drastically while the physical motion of an object remains constant. Materials with strong reflective and refractive interactions in particular can cause complex motion patterns. Here we investigate object motion constancy across various optical contexts and ask whether the human visual system compensates for other causal sources of motion.

    We performed two experiments. In the first experiment, observers had to estimate which of two stimuli was rotating faster around the vertical axis. The stimuli were displayed for 500 ms in a 2-IFC staircase design. For the Match stimulus, the illumination, material properties, and shape were constant; it was rendered at a high temporal resolution, allowing the small rotational speed changes required by the staircase design. The Test stimuli varied in ten optical properties (e.g., matte, glossy, anisotropic, translucent), three illumination maps (sunny, cloudy, indoor), and three shapes (knot, cubic, blobby), while their rotational speed remained constant. The second experiment had three conditions: (1) unmasked Match and Test stimuli (as in experiment one); (2) a masked Test stimulus (a circular Gaussian mask hiding the outer shape contours); and (3) a masked Test stimulus and a masked Match stimulus, where the Match stimulus was replaced by horizontally moving 2D pink noise. This experiment used a subset of the optical conditions.

    Expanding on our previously presented work [1], we applied three image-based motion models (Figure 1) to gain deeper insight into the motion cues that are predictive of human judgements: Lucas-Kanade (optical flow), RAFT (an optical-flow DNN), and FFV1MT (motion energy). First, we found clear illusory differences in perceived rotational speed, with even larger effects when the circular mask was applied. A transparent material with the refractive index of water is systematically perceived as rotating faster than other materials across all conditions. We performed a representational similarity analysis (RSA) to compare a range of metrics across conditions and flow models. We find that the gradient of the optical flow is a particularly good predictor of human performance. The gradient emphasizes local speed changes in the optical flow, for example those caused by moving highlights. We also observe that Lucas-Kanade is most predictive of human performance under most conditions, while RAFT is most stable across materials and closest to the ground truth. Our results further suggest that the human visual system does partially compensate for optical-flow effects across optical contexts when judging object motion.

    [1] Van Assen, J. J. R., Kawabe, T., & Nishida, S. Y. (2020). Object motion and flow variance across optical contexts. Journal of Vision, 20(11), 458.

    This work has been supported by a Marie Skłodowska-Curie Actions Individual Fellowship (H2020-MSCA-IF-2019-FLOW) and by JSPS Kakenhi JP20H05957.
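    To make the flow-gradient metric concrete, the sketch below computes a dense optical-flow field for a pair of frames and then the spatial gradient of its speed map, which highlights local speed changes such as those produced by moving highlights. This is only an illustration: OpenCV’s Farneback flow stands in for the dense Lucas-Kanade implementation used in the study, and the frame file names are hypothetical.

        # Sketch of a "gradient of the optical flow" metric.
        # Farneback dense flow is a stand-in for dense Lucas-Kanade;
        # frame file names are hypothetical placeholders.
        import cv2
        import numpy as np

        prev = cv2.imread("frame_000.png", cv2.IMREAD_GRAYSCALE)
        curr = cv2.imread("frame_001.png", cv2.IMREAD_GRAYSCALE)

        # Dense flow field: flow[y, x] = (dx, dy) in pixels per frame
        flow = cv2.calcOpticalFlowFarneback(prev, curr, None,
                                            pyr_scale=0.5, levels=3, winsize=15,
                                            iterations=3, poly_n=5, poly_sigma=1.2,
                                            flags=0)

        speed = np.linalg.norm(flow, axis=2)  # local image speed per pixel
        gy, gx = np.gradient(speed)           # spatial derivatives of the speed map
        flow_gradient = np.hypot(gx, gy)      # emphasizes local speed changes,
                                              # e.g. at moving highlights
        print("mean flow-gradient magnitude:", flow_gradient.mean())

    Summary statistics of this gradient map, computed per stimulus, are the kind of metric that can then be entered into an RSA against human speed judgements.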

    The Influence of Optical Material Properties on the Perception of Liquids

    Dataset associated with the following publication: Jan Jaap R. van Assen & Roland W. Fleming (2016). Influence of optical material properties on the perception of liquids. Journal of Vision, 16(15):12, 1–20. doi: 10.1167/16.15.12. One zip file contains the datasets of the various experiments; the other contains the liquid stimuli used during the experiments.

    Highlight shapes and perception of gloss for real and photographed objects

    Gloss perception strongly depends on the three-dimensional shape and the illumination of the object under consideration. In this study we investigated the influence of the spatial structure of the illumination on gloss perception. A diffuse light box, in combination with differently shaped masks, was used to produce a set of six simple and complex highlight shapes. The geometry of the simple highlight shapes was inspired by conventional artistic practice (e.g., a ring flash in photography, a window shape in painting, and a disk or square in cartoons). In the box we placed spherical stimuli painted in six degrees of glossiness, giving a stimulus set of six highlight shapes and six gloss levels, 36 stimuli in total. We performed three experiments, two using digital photographs on a computer monitor and one using the real spheres in the light box. Observers performed a comparison task, choosing which of two stimuli was glossiest, and a rating task, rating the glossiness of each stimulus. The results show that, perhaps surprisingly, more complex highlight shapes were perceived to produce a less glossy appearance than simple highlight shapes such as a disk or square. These findings held for both viewing conditions, on a computer display and in a real setting. Thus, variations in the spatial structure of even rather simple, extended-source illumination influence perceived glossiness.

    Shape, motion and optical cues to stiffness of elastic objects

    Dataset associated with the following publication: Paulun, V. C., Schmidt, F., van Assen, J. J. R., & Fleming, R. W. (2017). Shape, motion and optical cues to stiffness of elastic objects. Journal of Vision, 17(1):20, 1–20. doi: 10.1167/17.1.20. Each folder contains the data and stimulus material for one experiment, plus a text file with comments.

    Visual Features in the Perception of Liquids

    Perceptual constancy (identifying surfaces and objects across large image changes) remains an important challenge for visual neuroscience. Liquids are particularly challenging because they respond to external forces in complex, highly variable ways, presenting an enormous range of images to the visual system. To achieve constancy, the brain must perform a causal inference that disentangles the liquid’s viscosity from external factors, like gravity and object interactions, that also affect the liquid’s behavior. Here, we tested whether the visual system estimates viscosity using “midlevel” features that respond more to viscosity than to other factors. Observers reported the perceived viscosity of simulated liquids ranging from water to molten glass and exhibiting diverse behaviors (e.g., pouring, stirring). A separate group of observers rated the same animations for 20 midlevel 3D shape and motion features. Applying factor analysis to the feature ratings reveals that a weighted combination of four underlying factors (distribution, irregularity, rectilinearity, and dynamics) predicts perceived viscosity very well across this wide range of contexts (R² = 0.93). Interestingly, observers unknowingly ordered their midlevel judgments according to the one factor common across contexts: variation in viscosity. Principal component analysis reveals that, across the features, the first component lines up almost perfectly with viscosity (R² = 0.96). Our findings demonstrate that the visual system achieves constancy by representing stimuli in a multidimensional feature space based on complementary midlevel features, which successfully clusters very different stimuli together and teases similar stimuli apart, so that viscosity can be read out easily.
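    The shape of this analysis pipeline can be sketched in a few lines of Python: factor-analyse the 20 midlevel feature ratings into four latent factors, regress perceived viscosity on the factor scores, and check how the first principal component of the ratings relates to viscosity. This is a minimal sketch under stated assumptions; the random placeholder data, variable names, and stimulus count are all hypothetical, so the printed numbers will not reproduce the paper’s R² values.

        # Minimal sketch: factor analysis of midlevel feature ratings,
        # regression of perceived viscosity on factor scores, and PCA.
        # All data are random placeholders, not the study's ratings.
        import numpy as np
        from sklearn.decomposition import FactorAnalysis, PCA
        from sklearn.linear_model import LinearRegression

        ratings = np.random.rand(200, 20)     # (n_stimuli, 20 midlevel features)
        perceived_visc = np.random.rand(200)  # mean perceived viscosity per stimulus

        # Four latent factors (cf. distribution, irregularity,
        # rectilinearity, and dynamics in the paper)
        factors = FactorAnalysis(n_components=4, random_state=0).fit_transform(ratings)

        # Weighted combination of factors -> perceived viscosity
        model = LinearRegression().fit(factors, perceived_visc)
        print("R^2 =", model.score(factors, perceived_visc))

        # First principal component of the ratings; in the paper this
        # lines up almost perfectly with viscosity
        pc1 = PCA(n_components=1).fit_transform(ratings).ravel()
        print("corr(PC1, viscosity):", np.corrcoef(pc1, perceived_visc)[0, 1])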